Abstract: Indoor localization in multi-floor buildings is an
important research problem. Finding the correct floor, in a 
fast and efficient manner, in a shopping mall or an unknown 
university building can save the users’ search time and can 
enable a myriad of Location Based Services in the future. 
One of the most widely spread techniques for floor estimation 
in multi-floor buildings is the fingerprinting-based localization 
using Received Signal Strength (RSS) measurements coming from 
indoor networks, such as WLAN and BLE (Bluetooth Low 
Energy). The clear advantage of RSS-based floor estimation is 
its ease of implementation on a multitude of mobile devices at 
the Application Programming Interface (API) level, because RSS 
values are directly accessible through the API. However,
the downside of a fingerprinting approach, especially for
large-scale floor estimation and positioning solutions, is its need
to store and transmit a huge amount of fingerprinting data.
The problem becomes more severe when the localization is 
intended to be done on mobile devices (smart phones, tablets, 
etc.) which have limited memory, power, and computational 
resources. An alternative floor estimation method, which has
lower complexity and is faster than fingerprinting, is the
Weighted Centroid Localization (WCL) method. The price paid
is, however, a lower accuracy than the one
obtained with traditional fingerprinting with Nearest Neighbour
(NN) estimates. In this paper, a novel k-means-based method for
floor estimation via fingerprint clustering of WiFi and various
other positioning sensor outputs is introduced. Our method
achieves a floor estimation accuracy close to the one with NN
fingerprinting, while significantly reducing the complexity and
increasing the speed of the floor detection algorithm. The decrease in the
database size is achieved through storing and transmitting only 
the cluster heads (CH’s) and their corresponding floor labels. The 
performance of the proposed methods is evaluated using real- 
life indoor measurements taken from four multi-floor buildings. 
The numerical results show that the proposed k-means-based
method offers an excellent trade-off between the complexity and 
performance. 

Keywords: floor estimation, indoor localization, received signal
strength (RSS), z-coordinate estimation, fingerprinting localization, 
clustering, weighted centroid localization. 

I. Introduction 

Indoor localization and floor estimation in multi-floor
buildings are becoming more and more important in today’s
wireless world. Being able to achieve accurate ubiquitous
localization on hand-held battery-operated mobile devices in
both indoor and outdoor environments would open the window
to many new Location Based Services (LBS) spanning from
person and asset tracking and personal navigation to remote
health monitoring and LBS-based social networking. However,
despite the fact that outdoor global localization solutions
exist nowadays with the help of Global Navigation Satellite
Systems (GNSS), global solutions for indoor localization and
floor estimation are still hard to find.

One of the crucial aspects in indoor positioning is to 
accurately identify the floor where the user is located. False 
floor estimation could lead not only to waste of time, for 
example in Location Based Services dedicated to advertising 
and shopping assistance, but also to serious injury or loss 
of lives in emergency Location Based Services. The floor 
detection or estimation can be performed either in an initial
step, before accurate x-y localization, or jointly with the x-y
localization [?], [?]. In our paper we focus on the first case.

Typically, the floor estimation is achieved via fingerprinting 
approaches with Nearest Neighbor (NN) method, and such 
approaches can solve the indoor localization problem locally, 
as Skyhook and Polaris have proved [26]. Such solutions
are, however, costly and computationally rather expensive
to be used on a global scale (e.g. worldwide). In the
fingerprinting-based methods [7], [12], [15], [19], [?], the
location service providers construct a fingerprint database, 
transfer this database to the Mobile Station (MS), and the 
MS then computes its location and corresponding floor based 
on similar fingerprints. The fingerprint databases are typically 
very large since they do contain Received Signal Strengths 
(RSS’s) coming from various Access Points (APs) and in 
many points or coordinates within a building. Thus, if a global
floor estimation solution were to use a fingerprinting approach,
the fingerprint database transferred from the server to the MS
would include the fingerprints from all essential buildings in
the town (or the location area) where the mobile is situated. For
example, assuming that i) we hear an average of 30 APs in each
location point inside a building (a location point here refers to
the (x, y, z) coordinates inside a building where a measurement
is done), ii) we take measurements from an average of 600
location points per building, and iii) there are 25 important
buildings (malls, shopping centers, hospitals, airports, ...) in the
location area where the mobile was identified by the network,
then a total of 495000 parameters would need to be stored in
the database pertaining to that town and transferred to the
mobile. The parameters are the fingerprints, namely the (x, y, z)
coordinates and the measured RSS values per coordinate (one
RSS per each AP heard at that coordinate). In addition, if these
parameters are saved with a 32-bit accuracy, the database size
of such a service provider for the particular town of our example
would be around 15.84 Mbits. With typical cell-edge
(coverage) rates in the order of 50-100 kbps, especially in
legacy networks, the transfer to the mobile would take several
minutes, which is clearly unacceptable. Moreover, the server
would have to deal with hundreds or thousands of user requests
for localization, and this could easily create a bottleneck
on the server side as well.
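
The arithmetic of this example can be checked with a few lines of code. The following is only a sketch that recomputes the figures quoted above; the constant and function names are illustrative, and the conventions that one Mbit equals 10^6 bits and one kbps equals 1000 bit/s are assumptions.

```python
# Back-of-the-envelope size of a town-wide fingerprint database,
# using the example figures quoted in the text.
APS_PER_POINT = 30          # average number of APs heard per location point
POINTS_PER_BUILDING = 600   # average number of location points per building
BUILDINGS = 25              # important buildings in the location area
COORDS_PER_POINT = 3        # (x, y, z)
BITS_PER_PARAM = 32


def database_parameters():
    """Stored values: one RSS per heard AP plus the three coordinates, per point."""
    per_point = APS_PER_POINT + COORDS_PER_POINT
    return per_point * POINTS_PER_BUILDING * BUILDINGS


def database_bits():
    """Database size in bits at 32 bits per stored value."""
    return database_parameters() * BITS_PER_PARAM


def transfer_seconds(rate_kbps):
    """Transfer time at a given rate (assumption: 1 kbps = 1000 bit/s)."""
    return database_bits() / (rate_kbps * 1000)


if __name__ == "__main__":
    print(database_parameters())        # 495000
    print(database_bits() / 1e6)        # 15.84
    for rate in (50, 100):
        print(rate, "kbps:", round(transfer_seconds(rate) / 60, 1), "min")
```

Under these assumptions the transfer takes roughly 2.6 to 5.3 minutes, in line with the "several minutes" stated above.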

An alternative floor estimation approach is based only on
the known or estimated positions of the transmitters or APs
and some form of trilateration [?], [?]. The simplest of such
approaches is a weighted centroid approach, described for
example in [?], [?], [?]. While the complexity of such an
approach is much lower than that of NN fingerprinting,
its performance leaves a lot to be desired.

Thus, to solve the problem of huge databases and to increase
the estimation speed, while still achieving high floor detection
probabilities, we propose in this paper a novel clustering
method, which achieves a floor detection probability close to,
and at some points slightly above, that of NN fingerprinting,
but with much lower complexity. The proposed approach is not
limited to WiFi fingerprints, but can rather be applied to the
data collected from various other wireless technologies that
support RSS measurements, such as cellular data, BLE data,
RFID data, etc.

Related works and the novelty of our work: Clustering
of the fingerprints has already been studied in several papers
[?], [?], [?], [?], [?]. In all of these papers, the idea
is to divide the fingerprints into several clusters where the
size of each cluster is much smaller than the whole set of
fingerprints. The positioning phase then includes two stages:
first, the MS observation vector is compared to all CH’s, and
after finding the most similar cluster, in the second stage
the comparisons are done within that cluster. In other words,
the whole fingerprinting data is needed for localization. One
cannot rely solely on the CH’s to perform the positioning
task. This might not be a problem when the localization task
is carried out on the server side, as servers normally have
powerful processing capability and sufficient power supply, but
if the mobile device itself wants to accomplish the positioning
task, then the server has to send its whole data to the mobile
device, which is impractical because of both the transmission of
the huge dataset and the limited processing capability and power
supply on mobile devices [?].

To the best of our knowledge, the method proposed in
this paper is the only attempt at clustering the fingerprinting
data which needs only the CH’s information for the z-coordinate
positioning task and is therefore implementable on mobile
devices. A more detailed comparison between our proposed
clustering method and the existing clustering methods for
localization will be given in Section III-C after introducing our
method. A numerical comparison is also carried out against the
clustering approach proposed in [13].

Paper Organization: In Section II, the basics of two
conventional methods for indoor localization are described.
The proposed clustering approach for floor estimation is then
introduced in Section III. We will also explain the essence
of existing clustering algorithms and provide a brief analytical
comparison between the complexity of our proposed clustering
method and the existing clustering algorithms. In Section IV,
we study the performance of the proposed algorithm in terms
of probability of floor detection and computational complexity
against the two conventional methods described in Section II
as well as an existing clustering approach, based on real-life
measurements in four different multi-storey buildings. Finally,
we conclude the paper in Section V.

Mathematical notations: Throughout this paper, vectors
and matrices are shown with small and capital bold letters,
respectively. The second norm of a vector is denoted by
$\|\cdot\|_2$. For a set $\mathcal{M}$, the cardinality of the set is shown as
$|\mathcal{M}|$. Equality is denoted by $=$ and definition is denoted by
$\triangleq$. For a real number $a$, the smallest integer number bigger
than or equal to $a$ is denoted by $\lceil a \rceil$.

II. Conventional Approaches for Indoor 
Localization 

In this section, we briefly describe two conventional
methods for indoor localization: the Nearest Neighbor (NN)
fingerprinting localization, which is a high-complexity but
very promising method, and Weighted Centroid Localization
(WCL), which is a low-complexity method but with a performance
noticeably inferior to the fingerprinting approach.

A. Fingerprinting Localization and Problem Statement 

Consider a localization system equipped with $N_{ap}$ positioning
signals (e.g., RSS values received from APs). During the
offline phase, the positioning signals are collected in $N_{ap} \times 1$
measurement vectors $\mathbf{m}_n = [m_{n,1}, m_{n,2}, \ldots, m_{n,N_{ap}}]^T$,
$n = 1, \ldots, N_{fp}$, where $N_{fp}$ is the number of fingerprints
collected in the building and $m_{n,ap}$ is the RSS received from
access point $ap$ at the $n$-th collected fingerprint. The
corresponding known 3-D location of $\mathbf{m}_n$ is denoted by
$\mathbf{c}_n = [x_n, y_n, z_n]^T$, $n = 1, \ldots, N_{fp}$. In the fingerprinting approach, the
fingerprints $\{\mathbf{m}_n, \mathbf{c}_n\}$, $n = 1, \ldots, N_{fp}$, are stored and used
for localization purposes.

Assume that an MS observes a positioning vector
$\mathbf{m}_{MS} = [m_{MS,1}, m_{MS,2}, \ldots, m_{MS,N_{ap}}]^T$, where $m_{MS,ap}$ is the RSS
from the $ap$-th AP. The basic 1-NN fingerprinting approach
estimates the location of the MS as

$$\hat{\mathbf{c}}_{MS,fp} = \mathbf{c}_j, \tag{1}$$

where

$$j = \arg\min_{n} d(\mathbf{m}_{MS}, \mathbf{m}_n), \tag{2}$$

where $d(\cdot,\cdot)$ is a dissimilarity measure which is determined
based on our assumption for the noise. For instance, if we assume
that the noise which deviates $\mathbf{m}_{MS}$ from $\mathbf{m}_n$ is i.i.d. white
Gaussian, then $d(\mathbf{m}_{MS}, \mathbf{m}_n)$ is simply the squared Euclidean
distance between $\mathbf{m}_{MS}$ and $\mathbf{m}_n$, i.e.,

$$d(\mathbf{m}_{MS}, \mathbf{m}_n) = \|\mathbf{m}_{MS} - \mathbf{m}_n\|_2^2. \tag{3}$$


In general, the fingerprint-based localization approach is a pattern
matching approach [?], [?], [?], rooted in pattern recognition
[?], which tries to match the pattern $\mathbf{m}_{MS}$ observed by the
MS to the examples $\{\mathbf{m}_n\}$ collected in the training data
set and chooses the location of the least dissimilar example
(fingerprint) as the location of the MS. In this regard, each element
of the measurement vector $\mathbf{m}_n$ is a feature of the location $\mathbf{c}_n$.
More generally, any measured signal which depends only
on the measurement location (regardless of noise, shadowing
and other uncertainties) can be regarded as a feature of that
location and used for localization with the fingerprinting scheme.
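
As an illustration of (1)-(3), the 1-NN rule can be sketched in a few lines. This is a minimal sketch and not the implementation used in the measurements: the function names and the toy RSS values are hypothetical, and it assumes that every fingerprint vector has the same length (for example, unheard APs filled with a fixed floor value).

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length RSS vectors, as in (3)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nn_locate(m_ms, fingerprints):
    """Return the stored location of the fingerprint least dissimilar to m_ms.

    fingerprints: list of (m_n, c_n) pairs, where m_n is an RSS vector and
    c_n = (x, y, z) is the location at which it was collected.
    """
    _, best_c = min(fingerprints, key=lambda fp: squared_distance(m_ms, fp[0]))
    return best_c


if __name__ == "__main__":
    # Toy database: made-up RSS values for three APs, two floor heights.
    db = [
        ([-40.0, -70.0, -90.0], (1.0, 2.0, 0.0)),
        ([-45.0, -65.0, -85.0], (4.0, 2.0, 0.0)),
        ([-80.0, -50.0, -60.0], (1.0, 2.0, 3.5)),
    ]
    print(nn_locate([-78.0, -52.0, -63.0], db))  # (1.0, 2.0, 3.5)
```

Note that every query scans all stored fingerprints, which is the cost that motivates reducing the database.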




The main problem with the fingerprinting approach is the
huge amount of data which must be stored by servers and
transmitted to the MS to localize itself when $N_{fp}$ is a large
number. The situation becomes even more severe when
fingerprints keep being collected over time. If we want
to use fingerprinting methods for localizing the mobile device,
it can only be done on the server side. Because of the limited
processing capability and power supply of most mobile
devices, they are not capable of storing and processing such a
huge amount of data [27]; furthermore, transmitting such an
amount of fingerprinting data from the server to the mobile device
takes a lot of time, which makes localization by mobile
devices impractical.


B. Weighted Centroid Localization (WCL) 


The weighted centroid localization approach, first proposed for
position estimation in wireless sensor networks [?], is a simple
and low-complexity but promising localization approach. The
position of the MS in the WCL approach is computed as the
weighted average of the positions of the APs heard by the MS.
Denoting the set of all hearable APs by $\mathcal{H}$ and the (known)
coordinates of the APs by $\mathbf{c}_{ap} = (x_{ap}, y_{ap}, z_{ap})$, $ap = 1, \ldots, |\mathcal{H}|$,
the WCL-based estimate of the mobile station coordinates is
computed as

$$\hat{\mathbf{c}}_{MS} = \frac{\sum_{ap \in \mathcal{H}} w_{ap}\, \mathbf{c}_{ap}}{\sum_{ap \in \mathcal{H}} w_{ap}}, \tag{4}$$

where $w_{ap}$ are weight functions. To weight shorter distances
(nearer APs) more than longer distances, $w_{ap}$ may be chosen
as [?]

$$w_{ap} = 1/(d_{ap})^g, \tag{5}$$

where $d_{ap}$ is the distance between the $ap$-th AP and the MS, and
the degree $g$ is to ensure that remote APs still impact the position
estimation [?].


Since the distances $d_{ap}$ are not readily available, and also
since the RSS heard from AP $ap$ is inversely proportional to $d_{ap}$,
the weights $w_{ap}$ in (4) can be replaced by the RSS to obtain the
following RSS-based formula for WCL [?], [23], [?]:

$$\hat{\mathbf{c}}_{MS,wc} = \frac{\sum_{ap \in \mathcal{H}} m_{MS,ap}\, \mathbf{c}_{ap}}{\sum_{ap \in \mathcal{H}} m_{MS,ap}}, \tag{6}$$

where $m_{MS,ap}$ is the measured RSS of AP number $ap$ by the MS.
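
For floor estimation only the height component of (6) is needed, which can be sketched as follows. This is a minimal illustration under stated assumptions: the RSS values are taken to be positive weights (for example on a linear scale, since raw dBm values are negative), and the helper `nearest_floor`, which maps the height estimate to the closest floor height, is a hypothetical addition and not part of the formulas above.

```python
def wcl_height(rss_by_ap, z_by_ap):
    """RSS-weighted centroid of the AP heights (height component of (6)).

    rss_by_ap: {ap_id: RSS heard by the MS}, assumed positive (linear scale).
    z_by_ap:   {ap_id: known or estimated z-coordinate of that AP}.
    Only APs present in both dictionaries contribute.
    """
    heard = [ap for ap in rss_by_ap if ap in z_by_ap]
    total = sum(rss_by_ap[ap] for ap in heard)
    if total <= 0:
        raise ValueError("need at least one heard AP with positive RSS weight")
    return sum(rss_by_ap[ap] * z_by_ap[ap] for ap in heard) / total


def nearest_floor(z_est, floor_heights):
    """Hypothetical helper: index of the floor whose height is closest to z_est."""
    return min(range(len(floor_heights)), key=lambda f: abs(floor_heights[f] - z_est))


if __name__ == "__main__":
    z_by_ap = {"a": 0.0, "b": 3.5, "c": 7.0}   # made-up AP heights in metres
    rss = {"a": 1.0, "b": 6.0, "c": 1.0}       # made-up linear-scale RSS
    z = wcl_height(rss, z_by_ap)
    print(z, nearest_floor(z, [0.0, 3.5, 7.0]))  # 3.5 1
```

The mobile device only needs one height value per AP, which is why this approach is so light in storage.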

Equation (6) can be written independently for each coordinate.
For instance, for the height coordinate (which is the
coordinate that matters in the floor estimation task) we have

$$\hat{z}_{MS,wc} = \frac{\sum_{ap \in \mathcal{H}} m_{MS,ap}\, z_{ap}}{\sum_{ap \in \mathcal{H}} m_{MS,ap}}. \tag{7}$$

The APs deployed in commercial and privately-owned buildings,
such as shopping malls or blocks of flats, are typically
owned by various owners and thus their locations are not
centralized or even known in totality. In industrial and university
buildings, the AP location may be known to some extent, but
as seen in our measurement campaigns, such information is
typically stored in incomplete or inexact form, because it is not
considered important from the communication point of view.
For these reasons, we assume in our work that the location
of APs is estimated based on the available fingerprint data,
and not known in advance. Therefore, to be able to use the
WCL approach, we have to first estimate the location of APs
in the training phase. To estimate them, we can again employ
the WCL approach and apply it to the collected fingerprinting
data to estimate the AP coordinates. Let us denote the set of
fingerprint measurements that hear the AP $ap$ by $\mathcal{F}_{ap}$ and the
RSS heard at the $n$-th fingerprint from the $ap$-th AP by $m_{n,ap}$. Then
the coordinates of the AP $ap$ can be calculated as

$$\mathbf{c}_{ap} = \frac{\sum_{n \in \mathcal{F}_{ap}} m_{n,ap}\, \mathbf{c}_n}{\sum_{n \in \mathcal{F}_{ap}} m_{n,ap}}. \tag{8}$$

From (7) it is clear that for performing the floor estimation
task, the only thing that we need to store at the mobile
side is the $N_{ap}$ numbers $z_{ap}$, $ap = 1, \ldots, N_{ap}$, which are the
z-coordinates of the Access Points. This makes the WCL approach
a promising one from the complexity point of view. However, the
performance of the method is relatively poor because the model
suggested by (7) is an inaccurate model.


III. K-means Algorithm for Fingerprint
Clustering

In this section, we introduce a novel algorithm for
substantially reducing the size of fingerprinting data using the
K-means clustering algorithm. The proposed algorithm achieves
a performance very close to the NN fingerprinting approach
while decreasing the computational complexity both in terms
of search time and memory size needed for storing the offline
data. We first briefly describe the basic form of K-means
clustering and then introduce our floorwise clustering approach.
Finally, we compare the complexity of the proposed method
with existing fingerprinting clustering approaches.


A. K-means clustering

K-means clustering [?] is a vector quantization method for
finding the CH’s in a set of unlabeled data [?]. The K-means
algorithm aims to partition $N$ observation vectors
$\{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$ of equal dimension into $N_c < N$ clusters by iteratively
moving the CH’s $\{\boldsymbol{\mu}_k\}_{k=1}^{N_c}$, which are the representatives of the
clusters, to minimize the within-cluster sum-of-squares

$$\sum_{k=1}^{N_c} \sum_{\mathbf{x}_i \in \mathcal{C}_k} \|\mathbf{x}_i - \boldsymbol{\mu}_k\|_2^2 \tag{9}$$

with respect to $\{\mathcal{C}_1, \mathcal{C}_2, \ldots, \mathcal{C}_{N_c}\}$, where $\mathcal{C}_k$ denotes the $k$-th
cluster. Minimizing the objective function in (9) is in general
NP-hard [?]. However, the K-means algorithm provides a suboptimal
solution to it by alternating the following two steps in each
iteration until convergence [?]:

1) for each CH, identify the subset of data points which
are closer to it than to any other CH; and

2) compute the mean of all data points belonging to each
CH and take it as the new CH.

The above-mentioned K-means iterations guarantee the
convergence to a stationary point of (9) [20].
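
The two alternating steps can be sketched as follows. This is a plain Lloyd-style iteration written only for illustration; the random initialization from the data points, the fixed seed, the iteration cap, and the handling of empty clusters are assumptions of this sketch and are not specified in the text.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def kmeans(points, n_clusters, max_iter=100, seed=0):
    """Alternate assignment and mean update; returns (cluster_heads, assignment)."""
    rng = random.Random(seed)
    # Assumption: initial heads are distinct data points drawn at random.
    heads = [list(p) for p in rng.sample(points, n_clusters)]
    assignment = None
    for _ in range(max_iter):
        # Step 1: assign every point to its closest cluster head.
        new_assignment = [
            min(range(n_clusters), key=lambda k: sq_dist(p, heads[k]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # assignments stopped changing: a stationary point is reached
        assignment = new_assignment
        # Step 2: move each head to the mean of the points assigned to it.
        for k in range(n_clusters):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # keep the old head if a cluster becomes empty
                heads[k] = mean(members)
    return heads, assignment


if __name__ == "__main__":
    pts = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
           [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
    heads, assignment = kmeans(pts, 2)
    print(heads, assignment)
```

Since only a stationary point is guaranteed, the result can in general depend on the initialization; repeating the run with several seeds and keeping the lowest objective value is a common remedy.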






B. Fingerprint clustering for floor detection 

When the goal of localization is to detect the floor, for 
example in a multistorey building, the clustering can be applied 
by clustering the fingerprints collected in each floor separately. 
The number of clusters can be different from one floor to 
another. 

We want to cluster the fingerprints in a building with $F$
floors with the ultimate goal to use the CH’s for floor detection.
Assume that the set of all fingerprints is partitioned as
$\mathcal{F} = \mathcal{F}_1 \cup \ldots \cup \mathcal{F}_F$, where $\mathcal{F}_f$ denotes the set of fingerprints collected
in floor $f$. The floorwise fingerprint clustering can then be
accomplished by applying the K-means clustering algorithm to the
vectors $\mathbf{m}_n \in \mathcal{F}_f$ for each $f$ separately. In the detection phase,
the CH’s in all floors are compared to the MS observation
vector $\mathbf{m}_{MS}$ and the floor of the most similar cluster head is
chosen as our estimate of the floor. Mathematically, denoting
the set of the CH’s in the $f$-th floor by $\{\bar{\mathbf{m}}_{f,1}, \ldots, \bar{\mathbf{m}}_{f,N_{c,f}}\}$, where
$N_{c,f}$ is the number of clusters in the $f$-th floor, we have

$$\hat{f} = \arg\min_{f} \min_{k}\, d(\mathbf{m}_{MS}, \bar{\mathbf{m}}_{f,k}). \tag{10}$$

This method is in general referred to as K-means
classification [?, Chapter 13] in the literature.
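
The online detection step in (10) can be sketched as below. The sketch assumes that the cluster heads have already been computed per floor on the server and pooled into a single list of (floor label, head vector) pairs; this data layout, the function names, and the toy values are illustrative assumptions, and the squared Euclidean distance of (3) is used as the dissimilarity measure.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length RSS vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def detect_floor(m_ms, cluster_heads):
    """Floor label of the cluster head closest to the MS observation, as in (10).

    cluster_heads: list of (floor_label, head_vector) pairs pooled over all floors.
    """
    floor, _ = min(cluster_heads, key=lambda fh: sq_dist(m_ms, fh[1]))
    return floor


if __name__ == "__main__":
    # Made-up cluster heads for a three-floor building and three APs.
    heads = [
        (0, [-40.0, -70.0, -90.0]),
        (0, [-55.0, -60.0, -88.0]),
        (1, [-75.0, -50.0, -65.0]),
        (2, [-90.0, -72.0, -45.0]),
    ]
    print(detect_floor([-78.0, -52.0, -63.0], heads))  # 1
```

Only the pooled list of heads and labels has to reach the mobile device; the individual fingerprints are never consulted online.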

C. Comparison to the existing algorithms 

To further elaborate on the novelty of the proposed
algorithm, here we compare it with the existing fingerprint
clustering algorithms based on the formulation given above.

We denote the set of all fingerprint measurements by
$\mathcal{M} = \{\mathbf{m}_1, \ldots, \mathbf{m}_{N_{fp}}\}$ and the set of the corresponding fingerprint
coordinates by $\mathcal{C} = \{\mathbf{c}_1, \ldots, \mathbf{c}_{N_{fp}}\}$. Furthermore, we denote
the set of resulting cluster heads by $\bar{\mathcal{M}} = \{\bar{\mathbf{m}}_1, \ldots, \bar{\mathbf{m}}_{N_c}\}$ and
the set of measurement vectors in the $c$-th cluster by $\mathcal{M}_c$,
$c = 1, \ldots, N_c$. Clearly we have $\mathcal{M} = \bigcup_{c=1}^{N_c} \mathcal{M}_c$. The existing
fingerprinting clustering methods all include the following
three steps:

S1 First we find the most similar element of the set $\bar{\mathcal{M}}$
to $\mathbf{m}_{MS}$. Denote this element by $\bar{\mathbf{m}}_{\hat{c}}$.

S2 Then we find the most similar element of the set
$\mathcal{M}_{\hat{c}}$ to $\mathbf{m}_{MS}$. Denote this element by $\mathbf{m}_{\hat{j}}$. Notice
that $\mathbf{m}_{\hat{j}} \in \mathcal{M}$.

S3 Finally, our estimate for the coordinate of the MS is
$\mathbf{c}_{\hat{j}} \in \mathcal{C}$.

The main problem of these methods, which has been solved in
our proposed algorithm, is that in order to carry out step S2, we
have to save all the fingerprints. The only advantage of these
clustering algorithms over the ordinary fingerprint matching
methods is that after finding the best-matching cluster head,
the search is limited to that cluster instead of the
whole database. But since it is not known in advance which
cluster will be chosen in step S1, these methods still
need to save all of the cluster sets $\mathcal{M}_c$, $c = 1, \ldots, N_c$ (whose
union is equal to the whole fingerprinting database), the
z-coordinates of all fingerprints, and the set of the cluster heads
$\bar{\mathcal{M}}$, which altogether are of size $(N_{ap} + 1)N_{fp} + N_{ap}N_c$.

In our method, on the other hand, we do not need to save
$\mathcal{M}$ and $\mathcal{C}$. We are able to estimate the floor only from the
cluster heads and their corresponding floor labels, which is
only of size $(N_{ap} + 1)N_c$. In Section IV we will provide
a numerical comparison of our method against one of the
existing clustering approaches as well as the two conventional
floor estimation methods explained in Section II.
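
To give a feeling for the two storage expressions above, the following sketch evaluates them for one set of inputs. The function names and the numbers in the example are hypothetical and chosen only for illustration; they are not taken from the measurement campaign.

```python
def stored_values_two_stage(n_ap, n_fp, n_c):
    """Existing two-stage clustering: all fingerprints with their z-coordinates,
    plus the cluster heads, i.e. (N_ap + 1) * N_fp + N_ap * N_c values."""
    return (n_ap + 1) * n_fp + n_ap * n_c


def stored_values_heads_only(n_ap, n_c):
    """Heads-only scheme: each cluster head plus its floor label,
    i.e. (N_ap + 1) * N_c values."""
    return (n_ap + 1) * n_c


if __name__ == "__main__":
    # Hypothetical building: 100 APs, 5000 fingerprints, 50 cluster heads.
    two_stage = stored_values_two_stage(100, 5000, 50)
    heads_only = stored_values_heads_only(100, 50)
    print(two_stage, heads_only, round(two_stage / heads_only))  # 510000 5050 101
```

In this example the heads-only scheme stores about one hundredth of the values, roughly the ratio of cluster heads to fingerprints.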

IV. Measurement Results and Analysis 

In this section, we study the performance of the proposed
clustering algorithm using selected real-life measurement
examples.

A. Measurement set-up 

The numerical examples here are based on real-life WLAN 
data collected in four multi-storey buildings. The first building 
is a four-floor university building (Univ-1), the second building 
is a three-floor university building (Univ-2), the third building 
is a six-floor shopping mall (Mall), and finally the fourth 
building is a four-floor office building (Office). 

We have two sets of data for each building. The first 
data set includes the fingerprints collected in the building 
which is used for training. This is the data set to which the 
fingerprint clustering is applied and afterwards we only use 
the CH’s for floor detection purposes. We refer to this data 
set as fingerprinting data. The second set has been collected 
along several different tracks in each building, where each 
track includes tens to hundreds of data points. This data set 
will be used for examining the performance of the proposed 
algorithm and is referred to as test data. 

The measurement points for collecting fingerprinting data
and test data in Univ-1 and Univ-2 are illustrated in Figures
1 and 2, respectively. Figure 3 shows the power map of a
selected AP, namely AP number 4, in the first floor of Univ-1.
This is also the floor where AP 4 is situated. The red points
have the highest RSS values and are in fact the points closest
to the physical location of the AP, and the blue points have the
lowest RSS values and are the points farthest from the AP.

A summary of relevant technical details in each building,
including the number of floors $N_{fl}$, the size of the fingerprinting
data determined by $N_{fp}$, the size of the test data determined by
the number of test points $N_t$, and the number of Access Points
$N_{ap}$ heard, is shown in Table I.

B. Numerical study of four methods for floor estimation 

As mentioned in Section III-B, we apply the K-means clustering
algorithm to the fingerprints in each floor separately.
This task can be done on the server side, where the computational
resources are powerful enough to apply the clustering
algorithm to a possibly huge amount of fingerprinting data. The
server then sends the computed cluster heads together with the
floor label of each cluster head to the mobile device to be used
for positioning in the online phase. In other words, the data
needed on the mobile side for performing the floor detection
task is limited to the cluster heads and their corresponding
floor labels. We remark that if we want to use the ordinary
fingerprinting method, the server would need to send the entire
fingerprinting data to the mobile device. Therefore, by using the
clustering algorithm, we achieve a considerable reduction
in the size of data needed to be transmitted to the mobile device
for floor detection. The size reduction provides two benefits on
the mobile side: first, the memory size required for storing the
data (cluster heads) is less than for ordinary fingerprinting,
and second, the complexity of the search for finding the most
similar cluster head decreases significantly.

Fig. 1. Measurement points for collecting fingerprinting data and test data
in Univ-1.

Fig. 2. Measurement points for collecting fingerprinting data and test data
in Univ-2.

TABLE I. Number of floors $N_{fl}$, number of training samples
(fingerprinting data) $N_{fp}$, number of test samples $N_t$, and
number of access points $N_{ap}$ in each of the four buildings.

Building   $N_{fl}$   $N_{fp}$   $N_t$   $N_{ap}$
Univ-1     4          16080      6796    509
Univ-2     3          9923       2301    489
Mall       6          1633       3503    468
Office     4          354        3873    1103


Fig. 3. Power map (the map of RSS values) of the first floor of Univ-1 for
AP number 4. The figure shows how the received signal strength (RSS) from
AP number 4 is distributed over the floor.

Tables II, III, and IV show, respectively, the probability of
floor detection, the time for floor estimation in the online phase,
and the size of data needed on the mobile device for four different
methods: ordinary 1-Nearest Neighbor (1-NN) fingerprint
positioning, the Weighted Centroid Localization (WCL) method,
the clustering approach proposed in [13], and our proposed
floorwise clustering algorithm. The reported time is the running
time of the floor detection algorithm on a computer with a
2.2 GHz Intel Core i7 CPU and 16 GB of random access memory (RAM).
For the method proposed in
[13] and also for our approach, the clustering is implemented
for three different clustering ratios $\rho \in \{0.01, 0.05, 0.1\}$,
where the clustering ratio has the following relationship
with the number of cluster heads in the $f$-th floor, $N_{ch,f}$, and the
total number of fingerprints in the $f$-th floor, $N_{fp,f}$:

$$N_{ch,f} = \lceil \rho \times N_{fp,f} \rceil. \tag{11}$$
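
The relation in (11) can be evaluated per floor as sketched below. The per-floor fingerprint counts in the example are made up for illustration (the per-floor counts of the measured buildings are not listed in the text), and the use of exact fractions is a design choice of this sketch to avoid floating-point round-off before taking the ceiling.

```python
import math
from fractions import Fraction


def cluster_heads_per_floor(fingerprints_per_floor, rho):
    """Number of cluster heads per floor: ceil(rho * N_fp,f) for each floor f."""
    rho_exact = Fraction(str(rho))  # exact decimal value of the clustering ratio
    return [math.ceil(rho_exact * n) for n in fingerprints_per_floor]


if __name__ == "__main__":
    counts = [120, 95, 301]  # hypothetical fingerprints per floor
    print(cluster_heads_per_floor(counts, 0.01))  # [2, 1, 4]
    print(cluster_heads_per_floor(counts, 0.05))  # [6, 5, 16]
```

Because of the ceiling, every floor with at least one fingerprint keeps at least one cluster head, even for a very small clustering ratio.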

As can be seen from Table II, even with the very
small clustering ratio of $\rho = 0.01$, the performance of the
proposed floorwise clustering approach is very close to the 1-NN
fingerprinting approach for the first two buildings. The
performance for the next two buildings, namely Mall and
Office, however, degrades, which is mainly because of the relatively
small number of fingerprints collected in these two buildings
(see Table I). The performance is clearly superior to the WCL
approach. The performance of [13] is similar to the 1-NN
fingerprinting approach except for the first building (which
has the highest number of fingerprints), where its performance
degrades significantly. This is because when the number of
fingerprints is very large, it is more likely that the method
in [13] puts fingerprints from different floors into the same
cluster, which may eventually result in an erroneous estimate
for the floor.

For higher values of $\rho$, on the other hand, the performance
of our proposed method becomes very similar to 1-NN
fingerprinting. For some points it even slightly outperforms 1-NN
fingerprinting, which is because, like any other compression
algorithm, the clustering approach provides some denoising too.

From Table III we can see that computing the estimated floor
with our proposed clustering approach is much faster than with
the ordinary fingerprinting approach. The proposed method is also
faster than the clustering approach of [13] for the smallest
clustering ratio $\rho = 0.01$ in all four buildings (for larger
ratios the ordering varies between buildings), and is only inferior to
WCL, which has a rather poor floor estimation performance.

From Table IV it can be seen that the size of data needed at
the mobile side for our method is noticeably smaller than those
of the 1-NN fingerprinting approach and the clustering approach
of [13], especially for small clustering ratios. The method of
[13] needs the highest memory size, which, as explained in Section
III-C, is because it has to store the cluster heads in addition
to all fingerprints.

TABLE II. Probability of Floor Detection for each method.

Building   1-NN     WCL      [13]                                   Proposed Floorwise Clustering
                             $\rho=0.01$  $\rho=0.05$  $\rho=0.1$   $\rho=0.01$  $\rho=0.05$  $\rho=0.1$
Univ-1     0.8868   0.7016   0.7076       0.7088       0.7056       0.8973       0.8996       0.9036
Univ-2     0.9944   0.6784   0.9831       0.9904       0.9904       0.9739       0.9930       0.9870
Mall       0.9255   0.6041   0.8927       0.9146       0.9212       0.8401       0.9055       0.9366
Office     0.8084   0.7033   0.8012       0.7896       0.8066       0.7088       0.7981       0.8043

TABLE III. Elapsed time (in seconds) for floor estimation using each method.

Building   1-NN     WCL      [13]                                   Proposed Floorwise Clustering
                             $\rho=0.01$  $\rho=0.05$  $\rho=0.1$   $\rho=0.01$  $\rho=0.05$  $\rho=0.1$
Univ-1     32.62    0.066    0.67         1.229        2.769        0.35         1.25         3.92
Univ-2     5.935    0.026    0.228        0.284        0.511        0.095        0.258        0.772
Mall       0.832    0.031    0.171        0.155        0.195        0.069        0.100        0.180
Office     0.366    0.035    0.218        0.160        0.162        0.084        0.095        0.137

TABLE IV. The size of data (in kilobytes) needed at the mobile end for performing the floor detection task.

Building   1-NN     WCL      [13]                                   Proposed Floorwise Clustering
                             $\rho=0.01$  $\rho=0.05$  $\rho=0.1$   $\rho=0.01$  $\rho=0.05$  $\rho=0.1$
Univ-1     864      0.412    942          1100         1300         78           319          692
Univ-2     553      0.389    602          709          811          53           203          451
Mall       315      0.534    332          356          381          20           61           123
Office     262      0.663    274          295          315          20           53           106

Finally, Figure 4 illustrates an overall comparison between
the four mentioned methods based on the three performance
metrics of i) the elapsed time for computing the floor
estimation, ii) the data size needed on the mobile side, and iii)
the probability of floor detection. The clustering ratio for the
clustering method of [13] and our proposed clustering method
is $\rho = 0.01$. The colors distinguish between the methods,
and the four vertices of each pyramid correspond to the
four buildings under study in the experiment. As can be
seen, the proposed method (red pyramid) is much closer to
the origin of the horizontal plane than [13] (blue pyramid)
and 1-NN (cyan pyramid), which means it has a much lower
complexity than them. It is not as close as the WCL (magenta
pyramid), but in return it delivers a much better floor detection
probability than WCL. Thus, overall, the proposed method
provides excellent floor detection performance while
substantially reducing the computing and data transfer
complexities.

V. Conclusion 

With the goal of substantially reducing the size of
fingerprinting data needed for storage and transmission in floor
estimation, a method for clustering fingerprints using the K-means
algorithm was proposed. The proposed technique applies the
clustering algorithm floorwise and keeps and transmits only the
cluster heads together with their corresponding floor labels for
floor estimation. The performance of the proposed method
was evaluated with comprehensive real-life indoor
measurements. The obtained results show that while the proposed
method delivers a significant enhancement in the speed of the floor
estimation algorithm and a substantial reduction in the size of the
fingerprint database needed at the mobile device, its localization
performance is on par with the conventional fingerprinting
approach, which uses all the data for accomplishing the localization
task.

Fig. 4. Comparison between the four methods based on the 3 performance
metrics, namely elapsed time, data size, and floor detection probability. The
clustering ratio for [13] and our proposed method is $\rho = 0.01$.